
Hang Li

NEC Corporation

Whole-Pool Setwise Reranking with Long-Context Language Models

Jun 01, 2026
Via arXiv

Task-Focused Memorization for Multimodal Agents

May 29, 2026
Via arXiv

Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics

Apr 22, 2026
Via arXiv

Doc-V*: Coarse-to-Fine Interactive Visual Reasoning for Multi-Page Document VQA

Apr 15, 2026
Via arXiv

Q-Mask: Query-driven Causal Masks for Text Anchoring in OCR-Oriented Vision-Language Models

Mar 31, 2026
Via arXiv

Can MLLMs Read Students' Minds? Unpacking Multimodal Error Analysis in Handwritten Math

Mar 26, 2026
Via arXiv

SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

Mar 24, 2026
Via arXiv

TransText: Alpha-as-RGB Representation for Transparent Text Animation

Mar 19, 2026
Via arXiv

ReMem-VLA: Empowering Vision-Language-Action Model with Memory via Dual-Level Recurrent Queries

Mar 13, 2026
Via arXiv

Optimizing In-Context Demonstrations for LLM-based Automated Grading

Feb 28, 2026
Via arXiv